Regularized discriminative clustering
Authors
Abstract
A generative distributional clustering model for continuous data is reviewed, and methods for optimizing and regularizing it are introduced and compared. Based on pairs of auxiliary and primary data, the primary data space is partitioned into Voronoi regions that are maximally homogeneous in terms of the auxiliary data. As a result, only the variation in the primary data that is associated with variation in the auxiliary data influences the clusters. Because the whole primary space is partitioned, new samples can easily be clustered in terms of primary data alone. In experiments, the approach is shown to produce more homogeneous clusters than alternative methods. Two regularization methods are demonstrated to further improve the results: an entropy-type penalty for unequal cluster sizes, and the inclusion of a K-means component in the model. The latter can alternatively be interpreted as a special kind of joint distribution modeling in which the emphasis between discrimination and unsupervised modeling of the primary data can be tuned.
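The claim that new samples can be clustered from primary data alone follows from the Voronoi partition: once the region parameters are learned, assignment is just a nearest-prototype lookup. The sketch below illustrates that assignment step only, with made-up prototype locations; it is not the paper's learning procedure.

```python
import numpy as np

def assign_voronoi(X, prototypes):
    """Assign each primary-data sample to its Voronoi region,
    i.e. to the nearest prototype under Euclidean distance."""
    # Pairwise distances, shape (n_samples, n_regions), via broadcasting.
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return d.argmin(axis=1)

# Hypothetical prototypes and new samples (for illustration only).
prototypes = np.array([[0.0, 0.0], [5.0, 5.0]])
X_new = np.array([[0.2, -0.1], [4.8, 5.3]])
print(assign_voronoi(X_new, prototypes))  # -> [0 1]
```

No auxiliary data appears here, which is the point: the auxiliary data shapes where the prototypes end up during training, but plays no role at assignment time.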
Similar papers
Subspace Clustering via Graph Regularized Sparse Coding
Sparse coding has gained popularity and interest due to the benefits of dealing with sparse data, mainly space and time efficiencies. It presents itself as an optimization problem with penalties to ensure sparsity. While this approach has been studied in the literature, it has rarely been explored within the confines of clustering data. It is our belief that graph-regularized sparse coding can ...
Discriminative Clustering by Regularized Information Maximization
Is there a principled way to learn a probabilistic discriminative classifier from an unlabeled data set? We present a framework that simultaneously clusters the data and trains a discriminative classifier. We call it Regularized Information Maximization (RIM). RIM optimizes an intuitive information-theoretic objective function which balances class separation, class balance and classifier comple...
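The information-theoretic objective mentioned above can be read as maximizing the empirical mutual information between inputs and predicted labels, minus a complexity penalty. A minimal numerical sketch, assuming softmax class probabilities are already in hand (the function name, the L2 penalty, and `lam` are illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def info_max_objective(probs, weights, lam=0.1):
    """Sketch of an information-maximization clustering objective:
    I(x; y) ~= H(mean p) - mean H(p|x), minus an L2 penalty on weights.
    probs: (n_samples, n_classes) predicted class probabilities."""
    p_bar = probs.mean(axis=0)
    # Marginal label entropy: high when clusters are balanced.
    h_marg = -(p_bar * np.log(p_bar + 1e-12)).sum()
    # Mean conditional entropy: low when predictions are confident.
    h_cond = -(probs * np.log(probs + 1e-12)).sum(axis=1).mean()
    return h_marg - h_cond - lam * (weights ** 2).sum()

confident = np.array([[0.99, 0.01], [0.01, 0.99]])  # balanced and separated
uniform = np.full((2, 2), 0.5)                      # no separation at all
w = np.zeros(3)
print(info_max_objective(confident, w) > info_max_objective(uniform, w))
```

The two entropy terms correspond directly to the class-balance and class-separation trade-off described in the abstract.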
A Joint Optimization Framework of Sparse Coding and Discriminative Clustering
Many clustering methods highly depend on extracted features. In this paper, we propose a joint optimization framework in terms of both feature extraction and discriminative clustering. We utilize graph regularized sparse codes as the features, and formulate sparse coding as the constraint for clustering. Two cost functions are developed based on entropy-minimization and maximum-margin clusterin...
Multi-view Feature Learning with Discriminative Regularization
More and more multi-view data which can capture rich information from heterogeneous features are widely used in real world applications. How to integrate different types of features, and how to learn low dimensional and discriminative information from high dimensional data are two main challenges. To address these challenges, this paper proposes a novel multi-view feature learning framework, wh...
Local Learning Regularized Nonnegative Matrix Factorization
Nonnegative Matrix Factorization (NMF) has been widely used in machine learning and data mining. It aims to find two nonnegative matrices whose product can well approximate the nonnegative data matrix, which naturally lead to parts-based representation. In this paper, we present a local learning regularized nonnegative matrix factorization (LLNMF) for clustering. It imposes an additional constr...
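The base factorization that LLNMF builds on, V ~= WH with W, H >= 0, is commonly fitted with multiplicative updates. A minimal sketch of that base NMF under the Frobenius loss (the local-learning regularizer described above is omitted; `nmf`, `n_iter`, and `eps` are illustrative names):

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF via multiplicative updates for the Frobenius loss.
    Returns nonnegative W (n, k) and H (k, m) with V ~= W @ H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Each update keeps its factor nonnegative and decreases the loss.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.abs(np.random.default_rng(1).random((6, 4)))
W, H = nmf(V, k=2)
print(np.linalg.norm(V - W @ H))  # reconstruction error of the rank-2 fit
```

Because both factors stay elementwise nonnegative, each sample is an additive combination of the learned parts, which is the parts-based representation the abstract refers to.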
ℓ2,1 Norm and Hessian Regularized Non-Negative Matrix Factorization with Discriminability for Data Representation
Matrix factorization based methods have widely been used in data representation. Among them, Non-negative Matrix Factorization (NMF) is a promising technique owing to its psychological and physiological interpretation of spontaneously occurring data. On one hand, although traditional Laplacian regularization can enhance the performance of NMF, it still suffers from the problem of its weak extra...